



Supplementary Materials: FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Neural Information Processing Systems

[...] Section A. We then introduce additional details on dataset construction in Section B. Further, we [...] Finally, we discuss the limitations and future work of the project in Section D. Please also find the [...] Details on attribute taxonomy and statistics. We visualize the rough distribution of visual attributes and subjects on the left. We also visualize the attribute alignment accuracy via human validation here. Due to space limitations, only 15 sub-subjects are listed for each major subject. The result shows that Image 4 exhibits inconsistencies, with the reasons provided.


FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Neural Information Processing Systems

Recent advances in text-to-image generation have enabled the creation of high-quality images with diverse applications. However, accurately describing desired visual attributes can be challenging, especially for non-experts in art and photography. An intuitive solution involves adopting favorable attributes from source images. Current methods attempt to distill identity and style from source images. However, "style" is a broad concept that includes texture, color, and artistic elements, but does not cover other important attributes like lighting and dynamics.